The aim of this note is to present an elementary proof of a variation of Harris'
ergodic theorem for Markov chains. This theorem, dating back to the fifties [Har56],
essentially states that a Markov chain is uniquely ergodic if it admits a "small" set (in a
technical sense to be made precise below) which is visited infinitely often. This gives
an extension of the ideas of Doeblin to the unbounded state space setting. Often this is
established by finding a Lyapunov function with "small" level sets [Has80, MT93]. If
the Lyapunov function is strong enough, one has a spectral gap in a weighted supremum
norm [MT92, MT93]. In particular, the transition probabilities converge exponentially fast
towards the unique invariant measure, and the constant in front of the exponential rate is
controlled by the Lyapunov function [MT93].

Traditional proofs of this result rely on the decomposition of the Markov chain into
excursions away from the small set and a careful analysis of the exponential tail of the
length of these excursions [Num84, Cha89, MT92, MT93]. There have been other variations
which have made use of Poisson equations or worked at getting explicit constants
[KM05, DMR04, DMLM03]. The present proof is very direct, and relies instead on introducing
a family of equivalent weighted norms indexed by a parameter $\beta$ and on making an
appropriate choice of this parameter that allows one to combine in a very elementary way the
two ingredients (existence of a Lyapunov function and irreducibility) that are crucial in
obtaining a spectral gap. Use of a weighted total-variation norm has been important since
[MT92].

The original motivation of this proof was the authors' work on spectral gaps in 
Wasserstein metrics. The proof presented in this note is a version of our reasoning in 
the total variation setting which we used to guide the calculations in [HM08]. While we
initially produced it for this purpose, we hope that it will be of interest in its own right. 



1. Setting and main result 

Throughout this note, we fix a measurable space $X$ and a Markov transition kernel $\mathcal{P}(x,\cdot)$
on $X$. We will use the notation $\mathcal{P}$ for the operators defined as usual on both the set of
bounded measurable functions and the set of measures of finite mass by

$$ (\mathcal{P}\varphi)(x) = \int_X \varphi(y)\,\mathcal{P}(x,dy), \qquad (\mathcal{P}\mu)(A) = \int_X \mathcal{P}(x,A)\,\mu(dx). $$

Hence we are using $\mathcal{P}$ both to denote the action on functions and its dual action on
measures. Note that $\mathcal{P}$ extends trivially to measurable functions $\varphi\colon X \to [0,+\infty]$. We first
assume that $\mathcal{P}$ satisfies the following geometric drift condition:

Assumption 1. There exists a function $V\colon X \to [0,\infty)$ and constants $K \ge 0$ and
$\gamma \in (0,1)$ such that

$$ (\mathcal{P}V)(x) \le \gamma V(x) + K, \tag{1} $$

for all $x \in X$.

Remark 1.1. One could allow $V$ to also take the value $+\infty$. However, since we do not
assume any particular structure on $X$, this case can immediately be reduced to the present
case by replacing $X$ by $\{x : V(x) < \infty\}$.

Assumption 1 ensures that the dynamics enters the "center" of the state space regularly,
with tight control on the length of the excursions from the center. We now assume
that a sufficiently large level set of $V$ is sufficiently "nice" in the sense that we have a
uniform "minorization" condition reminiscent of Doeblin's condition, but localized to the
interior of the level set.

Assumption 2. There exists a constant $\alpha \in (0,1)$ and a probability measure $\nu$ so that

$$ \inf_{x \in C} \mathcal{P}(x,\cdot) \ge \alpha\,\nu(\cdot), $$

with $C = \{x \in X : V(x) \le R\}$ for some $R > 2K/(1-\gamma)$, where $K$ and $\gamma$ are the
constants from Assumption 1.
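As a concrete illustration, both assumptions can be checked numerically on a finite state space. The sketch below is not taken from this note: it assumes a hypothetical chain on $\{0,\dots,n\}$ that jumps to $0$ with probability $1/2$ and otherwise moves one step up, with $V(x) = x$, and the helper names `drift_holds` and `minorization_constant` are purely illustrative.

```python
# Hypothetical example chain (an assumption, not from the note): on {0, ..., n},
# jump to 0 with probability 1/2, otherwise move one step up (stay at n if there).
def build_kernel(n):
    P = [[0.0] * (n + 1) for _ in range(n + 1)]
    for x in range(n + 1):
        P[x][0] += 0.5
        P[x][min(x + 1, n)] += 0.5
    return P

def apply_to_function(P, f):
    # (P f)(x) = sum_y P(x, y) f(y)
    return [sum(p * fy for p, fy in zip(row, f)) for row in P]

def drift_holds(P, V, gamma, K, tol=1e-12):
    # Assumption 1: (P V)(x) <= gamma * V(x) + K for every x.
    PV = apply_to_function(P, V)
    return all(pv <= gamma * v + K + tol for pv, v in zip(PV, V))

def minorization_constant(P, V, R):
    # Largest alpha with P(x, .) >= alpha * nu(.) for all x in C = {V <= R}:
    # alpha = sum_z min_{x in C} P(x, z); nu is the normalised pointwise minimum.
    C = [x for x, v in enumerate(V) if v <= R]
    return sum(min(P[x][z] for x in C) for z in range(len(V)))

n = 20
P = build_kernel(n)
V = [float(x) for x in range(n + 1)]
gamma, K = 0.5, 0.5
R = 3.0  # any R > 2K / (1 - gamma) = 2 is admissible here
print(drift_holds(P, V, gamma, K), minorization_constant(P, V, R))
```

For this particular chain the drift bound holds with $\gamma = K = 1/2$, and the minorization is carried entirely by the jump to $0$, so $\nu$ can be taken to be $\delta_0$.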

In order to state the version of Harris' theorem under consideration, we introduce the
following weighted supremum norm:

$$ \|\varphi\| = \sup_x \frac{|\varphi(x)|}{1 + V(x)}. \tag{2} $$

With this notation at hand, one has: 

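Theorem 1.2. If Assumptions 1 and 2 hold, then $\mathcal{P}$ admits a unique invariant measure
$\mu_*$. Furthermore, there exist constants $C > 0$ and $\gamma \in (0,1)$ such that the bound

$$ \|\mathcal{P}^n\varphi - \mu_*(\varphi)\| \le C\gamma^n \|\varphi - \mu_*(\varphi)\| $$

holds for every measurable function $\varphi\colon X \to \mathbb{R}$ such that $\|\varphi\| < \infty$.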

While this result is well-known, the proofs found in the literature are often quite 
involved and rely on careful estimates of the return times to small sets, combined with a 
clever application of Kendall's lemma. See for example [MT93, Section 15].

The aim of this note is to provide a very short and elementary proof of Theorem 1.2
based on a simple trick. Instead of working directly with (2), we define a whole family of
weighted supremum norms depending on a scale parameter $\beta > 0$ that are all equivalent
to the original norm (2):

$$ \|\varphi\|_\beta = \sup_x \frac{|\varphi(x)|}{1 + \beta V(x)}. $$

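We also define the associated dual metric $\rho_\beta$ on probability measures given by

$$ \rho_\beta(\mu_1,\mu_2) = \sup_{\varphi : \|\varphi\|_\beta \le 1} \int_X \varphi(x)\,(\mu_1 - \mu_2)(dx). \tag{3} $$

It is well-known that $\rho_\beta$ is nothing but a weighted total variation distance:

$$ \rho_\beta(\mu_1,\mu_2) = \int_X \bigl(1 + \beta V(x)\bigr)\,|\mu_1 - \mu_2|(dx). $$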

With these notations, our main result is: 

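Theorem 1.3. If Assumptions 1 and 2 hold, then there exist $\bar\alpha \in (0,1)$ and $\beta > 0$ so that

$$ \rho_\beta(\mathcal{P}\mu_1, \mathcal{P}\mu_2) \le \bar\alpha\,\rho_\beta(\mu_1,\mu_2) $$

for any probability measures $\mu_1$ and $\mu_2$ on $X$. In particular, for any $\alpha_0 \in (0,\alpha)$ and
$\gamma_0 \in (\gamma + 2K/R, 1)$ one can choose $\beta = \alpha_0/K$ and
$\bar\alpha = (1 - (\alpha - \alpha_0)) \vee (2 + R\beta\gamma_0)/(2 + R\beta)$.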

Remark 1.4. The interest of this result lies in the fact that it is possible to tune $\beta$ in such a
way that $\mathcal{P}$ is a strict contraction for the distance $\rho_\beta$. In general, this does not imply that
$\mathcal{P}$ is a contraction for $\rho_1$, say, even though the equivalence of the norms $\|\cdot\|_\beta$ does of
course imply that there exists $n > 0$ such that $\mathcal{P}^n$ is such a contraction.
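The explicit constants in Theorem 1.3 are easy to evaluate. The following is a minimal sketch, assuming the formulas $\beta = \alpha_0/K$ and $\bar\alpha = (1 - (\alpha - \alpha_0)) \vee (2 + R\beta\gamma_0)/(2 + R\beta)$ as stated above; the function name `contraction_constants` and the sample values are illustrative only.

```python
def contraction_constants(alpha, gamma, K, R, alpha0, gamma0):
    """Return (beta, alpha_bar) from the formulas in Theorem 1.3.

    The checks mirror the hypotheses of Assumptions 1 and 2; K > 0 is required
    here because beta = alpha0 / K.
    """
    if not (0 < alpha < 1 and 0 < gamma < 1 and K > 0):
        raise ValueError("need alpha, gamma in (0, 1) and K > 0")
    if not R > 2 * K / (1 - gamma):
        raise ValueError("need R > 2K / (1 - gamma)")
    if not 0 < alpha0 < alpha:
        raise ValueError("need alpha0 in (0, alpha)")
    # Theorem 3.1 also allows the endpoint gamma0 = gamma + 2K/R.
    if not gamma + 2 * K / R <= gamma0 < 1:
        raise ValueError("need gamma0 in [gamma + 2K/R, 1)")
    beta = alpha0 / K
    alpha_bar = max(1 - (alpha - alpha0),
                    (2 + R * beta * gamma0) / (2 + R * beta))
    return beta, alpha_bar

beta, alpha_bar = contraction_constants(
    alpha=0.5, gamma=0.5, K=0.5, R=3.0, alpha0=0.25, gamma0=0.9)
print(beta, alpha_bar)
```

Both terms in the maximum are strictly below one, but they pull in opposite directions: a smaller $\alpha_0$ improves the first term and, through the smaller $\beta$, worsens the second.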



2. Alternative formulation of metric 

We now introduce an alternative definition of the weighted total variation norm $\rho_\beta$. We
begin by defining a metric $d_\beta$ between points in $X$ by

$$ d_\beta(x,y) = \begin{cases} 0 & x = y, \\ 2 + \beta V(x) + \beta V(y) & x \ne y. \end{cases} $$

Though slightly odd looking, the reader can readily verify that since $V \ge 0$, $d_\beta$ indeed
satisfies the axioms of a metric. This metric in turn induces a Lipschitz seminorm on
measurable functions and a metric on probability measures defined respectively by

$$ |\!|\!|\varphi|\!|\!|_\beta = \sup_{x \ne y} \frac{|\varphi(x) - \varphi(y)|}{d_\beta(x,y)}, \qquad d_\beta(\mu_1,\mu_2) = \sup_{\varphi : |\!|\!|\varphi|\!|\!|_\beta \le 1} \int_X \varphi(x)\,(\mu_1 - \mu_2)(dx). $$

It turns out that these norms are almost identical to the ones from the previous section. 
More precisely, one has: 

Lemma 2.1. One has the identity $|\!|\!|\varphi|\!|\!|_\beta = \inf_{c \in \mathbb{R}} \|\varphi + c\|_\beta$. In particular, $d_\beta = \rho_\beta$.

Proof. It is obvious that $|\!|\!|\varphi|\!|\!|_\beta \le \|\varphi\|_\beta$ and therefore $|\!|\!|\varphi|\!|\!|_\beta \le \inf_{c \in \mathbb{R}} \|\varphi + c\|_\beta$, so it
remains to show the reverse inequality.

Given any $\varphi$ with $|\!|\!|\varphi|\!|\!|_\beta \le 1$, we set $c = \inf_x \bigl(1 + \beta V(x) - \varphi(x)\bigr)$. Observe that for
any $x$ and $y$, $\varphi(x) \le |\varphi(y)| + |\varphi(x) - \varphi(y)| \le |\varphi(y)| + 2 + \beta V(x) + \beta V(y)$. Hence
$1 + \beta V(x) - \varphi(x) \ge -1 - \beta V(y) - |\varphi(y)|$. Since there exists at least one point with
$V(y) < \infty$ we see that $c$ is bounded from below and hence $|c| < \infty$.

Observe now that 

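$$ \varphi(x) + c \le \varphi(x) + 1 + \beta V(x) - \varphi(x) = 1 + \beta V(x), $$

and

$$ \begin{aligned} \varphi(x) + c &= \inf_y \bigl(\varphi(x) + 1 + \beta V(y) - \varphi(y)\bigr) \\ &\ge \inf_y \bigl(1 + \beta V(y) - |\!|\!|\varphi|\!|\!|_\beta \cdot d_\beta(x,y)\bigr) \ge -(1 + \beta V(x)), \end{aligned} $$

so that $|\varphi(x) + c| \le 1 + \beta V(x)$ as required.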

It follows that the sets $\{\varphi : \|\varphi\|_\beta \le 1\}$ and $\{\varphi : |\!|\!|\varphi|\!|\!|_\beta \le 1\}$ only differ by additive
constants, so that one has indeed $d_\beta = \rho_\beta$. □

Remark 2.2. Note that of course $d_\beta = \rho_\beta$ only for probability measures, or at least
positive measures of equal mass. Otherwise, $d_\beta$ is $+\infty$ in general, while $\rho_\beta$ need not be.
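The identity of Lemma 2.1 can be sanity-checked numerically on a finite state space. The sketch below is only an illustration under stated assumptions: the state space has eight points, $V$ and $\varphi$ are drawn at random, and the infimum over shifts $c$ is located by a ternary search, which is justified because $c \mapsto \|\varphi + c\|_\beta$ is convex. All function names are hypothetical.

```python
import random

def lipschitz_seminorm(phi, V, beta):
    # |||phi|||_beta = max over x != y of |phi(x) - phi(y)| / d_beta(x, y),
    # with d_beta(x, y) = 2 + beta V(x) + beta V(y).
    n = len(phi)
    return max(
        abs(phi[i] - phi[j]) / (2 + beta * V[i] + beta * V[j])
        for i in range(n) for j in range(n) if i != j
    )

def weighted_sup_norm(phi, V, beta, c=0.0):
    # ||phi + c||_beta = max_x |phi(x) + c| / (1 + beta V(x))
    return max(abs(p + c) / (1 + beta * v) for p, v in zip(phi, V))

def inf_over_shifts(phi, V, beta, iterations=200):
    # The minimiser lies in [-max(phi), -min(phi)]: outside this interval
    # phi + c has a constant sign, so moving c further away only increases the norm.
    lo, hi = -max(phi), -min(phi)
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if weighted_sup_norm(phi, V, beta, m1) <= weighted_sup_norm(phi, V, beta, m2):
            hi = m2
        else:
            lo = m1
    return weighted_sup_norm(phi, V, beta, (lo + hi) / 2)

rng = random.Random(0)
V = [rng.uniform(0, 5) for _ in range(8)]
phi = [rng.uniform(-10, 10) for _ in range(8)]
beta = 0.3
lhs = lipschitz_seminorm(phi, V, beta)
rhs = inf_over_shifts(phi, V, beta)
print(lhs, rhs)
```

The two printed values should agree up to the tolerance of the search, which is what the lemma predicts.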

3. Proof of main theorem 

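Theorem 3.1. If Assumptions 1 and 2 hold there exist an $\bar\alpha \in (0,1)$ and $\beta > 0$ such that

$$ |\!|\!|\mathcal{P}\varphi|\!|\!|_\beta \le \bar\alpha\,|\!|\!|\varphi|\!|\!|_\beta. $$

Actually, setting $\gamma_0 = \gamma + 2K/R < 1$, for any $\alpha_0 \in (0,\alpha)$ one can choose $\beta = \alpha_0/K$
and $\bar\alpha = (1 - \alpha + \alpha_0) \vee (2 + R\beta\gamma_0)/(2 + R\beta)$.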

Proof. Fix a test function $\varphi$ with $|\!|\!|\varphi|\!|\!|_\beta \le 1$. By Lemma 2.1 we can assume without loss
of generality that one also has $\|\varphi\|_\beta \le 1$. The claim then follows if we can exhibit $\bar\alpha < 1$
so that

$$ |\mathcal{P}\varphi(x) - \mathcal{P}\varphi(y)| \le \bar\alpha\,d_\beta(x,y). $$

If $x = y$, the claim is true. Henceforth we assume $x \ne y$. We begin by assuming
that $x$ and $y$ are such that

$$ V(x) + V(y) \ge R. \tag{4} $$

Fixing $\gamma_0$ as in the statement of the theorem, for any $\beta > 0$ we set
$\gamma_1 = (2 + \beta R\gamma_0)/(2 + \beta R)$. Observe that for $\beta \in (0,1)$ and $R > 0$, one has $\gamma_1 \in (\gamma_0, 1)$.
With these choices, we have from (1) and (4) the bound

$$ \begin{aligned} |\mathcal{P}\varphi(x) - \mathcal{P}\varphi(y)| &\le 2 + \beta\,\mathcal{P}V(x) + \beta\,\mathcal{P}V(y) \\ &\le 2 + \beta\gamma V(x) + \beta\gamma V(y) + 2\beta K \\ &\le 2 + \beta\gamma_0 V(x) + \beta\gamma_0 V(y) \\ &\le 2\gamma_1 + \beta\gamma_1 V(x) + \beta\gamma_1 V(y) = \gamma_1 d_\beta(x,y). \end{aligned} $$

The third line follows from our choice of $\gamma_0$ and the fact that by (4) we know that
$2K \le (\gamma_0 - \gamma)(V(x) + V(y))$. The last line follows from the fact that
$2(1 - \gamma_1) = \beta R(\gamma_1 - \gamma_0) \le \beta(\gamma_1 - \gamma_0)(V(x) + V(y))$ given our choice of $\gamma_1$.
We emphasise that up to now $\beta$ could be any positive number; only the precise value of
$\gamma_1$ depends on it (and gets "worse" for small values of $\beta$). The second part of the proof
will determine a choice of $\beta > 0$.

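Now consider the case of $x$ and $y$ such that $V(x) + V(y) \le R$ and hence $x, y \in C$.
For such $x$ and $y$ we define the Markov transition $\tilde{\mathcal{P}}$ by
$\tilde{\mathcal{P}}(x,\cdot) = \frac{1}{1-\alpha}\mathcal{P}(x,\cdot) - \frac{\alpha}{1-\alpha}\nu(\cdot)$.
Now we have $\mathcal{P}\varphi(x) = (1-\alpha)\tilde{\mathcal{P}}\varphi(x) + \alpha\int\varphi\,d\nu$ and
$\mathcal{P}\varphi(y) = (1-\alpha)\tilde{\mathcal{P}}\varphi(y) + \alpha\int\varphi\,d\nu$. Subtracting the second of these
expressions from the first and using that, since $V$ is a non-negative function,
$\tilde{\mathcal{P}}V(x) \le \frac{1}{1-\alpha}\mathcal{P}V(x)$ produces

$$ \begin{aligned} |\mathcal{P}\varphi(x) - \mathcal{P}\varphi(y)| &= (1-\alpha)\,|\tilde{\mathcal{P}}\varphi(x) - \tilde{\mathcal{P}}\varphi(y)| \\ &\le 2(1-\alpha) + (1-\alpha)\beta\bigl(\tilde{\mathcal{P}}V(x) + \tilde{\mathcal{P}}V(y)\bigr) \\ &\le 2(1-\alpha) + \beta\bigl(\mathcal{P}V(x) + \mathcal{P}V(y)\bigr) \\ &\le 2(1-\alpha) + \gamma\beta V(x) + \gamma\beta V(y) + 2\beta K. \end{aligned} $$

Hence fixing $\beta = \alpha_0/K$ for any $\alpha_0 \in (0,\alpha)$ and setting
$\gamma_2 = (1 - (\alpha - \alpha_0)) \vee \gamma \in (0,1)$ produces

$$ |\mathcal{P}\varphi(x) - \mathcal{P}\varphi(y)| \le 2\bigl(1 - (\alpha - \alpha_0)\bigr) + \gamma\beta V(x) + \gamma\beta V(y) \le \gamma_2\,d_\beta(x,y). $$

Setting $\bar\alpha = \gamma_1 \vee \gamma_2$ and recalling that $\gamma_1 > \gamma$ concludes the proof. □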
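The contraction can be observed directly on a small example. Since $|\!|\!|\mathcal{P}\varphi|\!|\!|_\beta \le \bar\alpha |\!|\!|\varphi|\!|\!|_\beta$ for all $\varphi$ is equivalent to $d_\beta(\mathcal{P}\delta_x, \mathcal{P}\delta_y) \le \bar\alpha\,d_\beta(x,y)$ for all $x \ne y$, it suffices to compare rows of the kernel in the weighted total variation distance. The sketch below assumes a hypothetical chain on $\{0,\dots,n\}$ (jump to $0$ with probability $1/2$, otherwise one step up) with $V(x) = x$, for which $\gamma = K = \alpha = 1/2$ and $R = 3$ are admissible; none of this is taken from the note.

```python
def build_kernel(n):
    # Hypothetical chain: jump to 0 w.p. 1/2, otherwise one step up (capped at n).
    P = [[0.0] * (n + 1) for _ in range(n + 1)]
    for x in range(n + 1):
        P[x][0] += 0.5
        P[x][min(x + 1, n)] += 0.5
    return P

def weighted_tv(mu1, mu2, V, beta):
    # rho_beta(mu1, mu2) = sum_z (1 + beta V(z)) |mu1(z) - mu2(z)|
    return sum((1 + beta * v) * abs(a - b) for a, b, v in zip(mu1, mu2, V))

n = 20
P = build_kernel(n)
V = [float(x) for x in range(n + 1)]

# Constants of Assumptions 1 and 2 for this chain (assumed, see lead-in).
alpha, gamma, K, R = 0.5, 0.5, 0.5, 3.0
alpha0 = 0.25                      # any value in (0, alpha)
gamma0 = gamma + 2 * K / R         # choice made in Theorem 3.1
beta = alpha0 / K
alpha_bar = max(1 - alpha + alpha0, (2 + R * beta * gamma0) / (2 + R * beta))

# Worst ratio rho_beta(P delta_x, P delta_y) / d_beta(x, y) over all x != y.
worst = max(
    weighted_tv(P[x], P[y], V, beta) / (2 + beta * V[x] + beta * V[y])
    for x in range(n + 1) for y in range(n + 1) if x != y
)
print(worst, alpha_bar)
```

The observed worst-case ratio is allowed to be strictly smaller than $\bar\alpha$: the theorem only gives an upper bound that is valid for every chain satisfying the two assumptions with these constants.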

Theorem 1.3 now follows as a corollary since $d_\beta = \rho_\beta$ and $d_\beta$ is the norm dual
to $|\!|\!|\cdot|\!|\!|_\beta$. In order to conclude that Theorem 1.2 holds, it only remains to show that our
assumptions imply that an invariant measure actually exists and that the integral of $V$
with respect to it is finite.

3.1. Existence of an invariant measure 

We have already shown that Assumptions 1 and 2 allow one to prove that, for some $\beta > 0$,
$\mathcal{P}$ is a strict contraction in the weighted total variation metric $\rho_\beta$ defined by (3). We now
show that the same assumptions are also sufficient to ensure the existence of an invariant
measure:

Theorem 3.2. If Assumptions 1 and 2 hold then there exists a probability measure $\mu_\infty$ on
$X$ such that $\int V\,d\mu_\infty < \infty$ and which is invariant in that $\mathcal{P}\mu_\infty = \mu_\infty$.

Proof. Fixing any $x \in X$, for $n \in \mathbb{N}$ define $\mu_n = \mathcal{P}^n\delta_x$. By Theorem 1.3 we know that
for some $\bar\alpha \in (0,1)$ and some $\beta > 0$,

$$ \rho_\beta(\mu_{n+1},\mu_n) \le \bar\alpha^n \rho_\beta(\mu_1,\delta_x). $$

Hence, $\mu_n$ is a Cauchy sequence. Since $\rho_\beta$ is complete for the space of probability
measures integrating $V$ (because the total variation distance is complete for the space of
measures with finite mass) there exists a probability measure $\mu_\infty$ so that $\rho_\beta(\mu_n,\mu_\infty) \to 0$ as
$n \to \infty$. Since this implies that $\mu_n \to \mu_\infty$ in total variation and $\mathcal{P}$ is always a contraction
in the total variation distance, it follows that $\mathcal{P}\mu_\infty = \lim \mathcal{P}\mu_n = \lim \mu_{n+1} = \mu_\infty$ as
required. □

3.2. A slightly different set of assumptions

Many results in the theory of Harris chains are proved under a slightly different set
of assumptions. The Lyapunov function condition in Assumption 1 is replaced with the
following:

Assumption 3. There exists a function $V\colon X \to [1,\infty)$ and constants $b > 0$, $\gamma \in (0,1)$
and a subset $S \subset X$ such that

$$ (\mathcal{P}V)(x) \le \gamma V(x) + b\,\mathbf{1}_S(x), \tag{5} $$

for all $x \in X$.

Clearly Assumption 3 implies Assumption 1 with $K = b$. The question is whether
Assumption 2 holds with that choice of $K$ and with $C$ defined as in Assumption 2. If it
does then our main theorem holds. However, Assumption 3 is most naturally paired with
the following modified version of Assumption 2:

Assumption 4. There exists a constant $\alpha \in (0,1]$ and a probability measure $\nu$ so that the
lower bound

$$ \inf_{x \in S} \mathcal{P}(x,\cdot) \ge \alpha\,\nu(\cdot) $$

holds. Here, the set $S$ is the same as in Assumption 3.

It is relatively clear that Assumptions 1 and 2 together imply Assumptions 3 and 4.
In particular, if one picks a $\bar\gamma \in (\gamma,1)$ sufficiently close to one, then $R > K/(\bar\gamma - \gamma)$ and
setting $S = \{x : V(x) \le R\}$ we see that the desired implication holds.

Remark 3.3. In general, one cannot hope for Assumptions 4 and 3 to imply Assumptions 1
and 2, and hence the existence of a spectral gap, without any further assumptions. A trivial
example is given by $X = \{0,1\}$ with the (deterministic) transition probabilities $\mathcal{P}(x,\cdot) = \delta_{1-x}$.
This Markov operator has spectrum $\{-1,1\}$ and therefore has no spectral gap. On
the other hand, Assumptions 4 and 3 are satisfied with $\alpha = 1$, $\gamma = 1/2$, and $b = 3/2$ if
one makes for example the choice $S = \{0\}$, $\nu = \delta_1$, and $V(x) = 1 + x$.

In spite of the preceding remark, we are now going to show that Assumptions 4
and 3 are essentially equivalent to Assumptions 1 and 2 from the previous section. More
precisely, for $N > 0$, define the "averaged" Markov operator

$$ \mathcal{Q} = \frac{1}{N}\sum_{k=1}^{N} \mathcal{P}^k. $$

Then we have:



Theorem 3.4. If $\mathcal{P}$ satisfies Assumptions 4 and 3, then there exists a choice of $N$ such
that $\mathcal{Q}$ satisfies Assumptions 1 and 2.
Proof. Fix some arbitrary $R$ with $R > 2b/(1-\gamma)$. Our aim is to show that we can find
$N > 0$, a probability measure $\bar\nu$ and a constant $\bar\alpha > 0$ such that $\mathcal{Q}(x,\cdot) \ge \bar\alpha\,\bar\nu(\cdot)$ for every
$x$ with $V(x) \le R$.

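Iterating (5), we find that one has the bound

$$ 1 \le \mathcal{P}^{n+1}V \le \gamma^{n+1}V + b\sum_{k=0}^{n}\gamma^k\,\mathcal{P}^{n-k}\mathbf{1}_S, \tag{6} $$

so that, on the set $S_n = \{x : V(x) \le \gamma^{-n-1}/2\}$, one has the lower bound

$$ \inf_{x \in S_n}\sum_{k=0}^{n}\gamma^k\,\mathcal{P}^{n-k}(x,S) \ge \frac{1}{2b}. \tag{7} $$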

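In particular this implies that for every $x \in X$, there exists $n$ such that $\mathcal{P}^n(x,S) > 0$.
Combining this with our two assumptions shows that $\int V(x)\,\nu(dx) = C < \infty$ so that,
integrating (6) with respect to $\nu$, we obtain

$$ 1 \le C\gamma^{n+1} + b\sum_{k=0}^{n}\gamma^k\,\bigl(\mathcal{P}^{n-k}\nu\bigr)(S). $$

Choosing $n$ sufficiently large then implies the existence of some $\ell > 0$ such that
$(\mathcal{P}^{\ell-1}\nu)(S) > 0$. Combining this with Assumption 4 shows that there exists $\tilde\alpha > 0$ such
that $\mathcal{P}^\ell\nu \ge \tilde\alpha\,\nu$. Setting now $\bar\nu = \frac{1}{\ell}\sum_{k=0}^{\ell-1}\mathcal{P}^k\nu$, it follows that one has the bound

$$ \mathcal{P}\bar\nu = \frac{1}{\ell}\sum_{k=1}^{\ell-1}\mathcal{P}^k\nu + \frac{1}{\ell}\mathcal{P}^\ell\nu \ge \frac{1}{\ell}\sum_{k=1}^{\ell-1}\mathcal{P}^k\nu + \frac{\tilde\alpha}{\ell}\nu \ge \tilde\alpha\,\bar\nu. $$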

In particular, this implies that for every $m \ge 1$ there exists a constant $\alpha_m$ such that the
lower bound

$$ \inf_{x \in S}\sum_{k=m}^{m+\ell}\mathcal{P}^k(x,\cdot) \ge \alpha_m\,\bar\nu(\cdot) \tag{8} $$

holds. Let now $n$ be sufficiently large such that $\gamma^{-n-1}/2 \ge R$ and set $N = n + 1 + \ell$.
Combining (7) and (8) then yields the desired result. □
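The two-state example of Remark 3.3 shows concretely what averaging buys. The sketch below assumes that the averaged operator has the form $\mathcal{Q} = \frac{1}{N}\sum_{k=1}^{N}\mathcal{P}^k$ and takes $N = 2$; the helper names are hypothetical. It first checks the drift bound (5) for $V(x) = 1 + x$, $S = \{0\}$, $\gamma = 1/2$, $b = 3/2$, and then compares the best uniform minorization constant of $\mathcal{P}$ and of $\mathcal{Q}$ over the whole space.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def averaged_operator(P, N):
    # Assumed form of the averaged operator: Q = (1/N) * sum_{k=1}^{N} P^k.
    n = len(P)
    power = [row[:] for row in P]          # holds P^k, starting with k = 1
    total = [[0.0] * n for _ in range(n)]
    for _ in range(N):
        total = [[t + p for t, p in zip(tr, pr)] for tr, pr in zip(total, power)]
        power = mat_mul(power, P)
    return [[t / N for t in row] for row in total]

def doeblin_constant(M):
    # Largest alpha with M(x, .) >= alpha * nu(.) for every state x.
    n = len(M)
    return sum(min(M[x][z] for x in range(n)) for z in range(n))

# Deterministic swap on {0, 1}: P(x, .) = delta_{1 - x}.
P = [[0.0, 1.0], [1.0, 0.0]]
V = [1.0, 2.0]                 # V(x) = 1 + x
gamma, b = 0.5, 1.5
indicator_S = [1.0, 0.0]       # S = {0}

PV = [sum(P[x][y] * V[y] for y in range(2)) for x in range(2)]
drift_ok = all(PV[x] <= gamma * V[x] + b * indicator_S[x] for x in range(2))

Q = averaged_operator(P, 2)
print(drift_ok, doeblin_constant(P), doeblin_constant(Q))
```

For the swap itself the two rows have disjoint supports, so no uniform minorization is possible, whereas both rows of the averaged operator coincide with the uniform distribution on $\{0,1\}$.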

Remark 3.5. Keeping track of the constants appearing in the proof of the previous result, 
we see that one can choose for example any integer N such that 

iV> l + log(^^ J Vix) i>{dx)^ / \og^ . 